INTRODUCTION


Recently, F. H. Ling and G. Schmidt (ref 1) argued that box-counting techniques are superior
to the Grassberger-Procaccia (correlation integral) algorithm (ref 2) for dealing with experimental
data sets.

Reference 1 applied a box-counting method to determine the capacity dimension D(0),
information dimension D(1), and correlation dimension D(2) of the attractors of the Henon map,
the logistic map, and the Lorenz equations, and of the attractor of pulsar 0950+08. They also applied the
conventional Grassberger-Procaccia method (i.e., the correlation integral method specialized to the
determination of the correlation dimension) to the same data sets. The box-counting algorithm of
Reference 1 is similar to that of Block, Bloh, and Schellnhuber (ref 3) in that only data concerning
occupied boxes are stored and analyzed.

The box-counting and correlation integral algorithms of Reference 1 yielded fractal
dimensions within about 3 percent of analytic values for the logistic map (using a 1000-point
sample), Henon map (2000 points), and Lorenz equations (4000 points) for embedding dimensions
>1, 2, and 3, respectively. Restricted point sets were employed to reflect the fact that experimental
point sets are generally limited to similar ranges. Reference 1 concluded that 4000 points are
sufficient for the determination of D(0), D(1), and D(2) of the fractal attractor of pulsar 0950+08.
The pulsar dimensions have values near 5 and were determined in a 14-dimensional embedding
space.


Reference 1 asserts that most authors use the correlation dimension as the "main measure"
of a strange attractor. The authors also conclude that box-counting algorithms require less computation
time and yield results at least as good as those obtained using the Grassberger-Procaccia method for
the correlation dimension. Therefore, Ling and Schmidt (ref 1) conclude that box-counting is
superior to the Grassberger-Procaccia algorithm. However, they also note that the
Grassberger-Procaccia algorithm is "superior" to box-counting for determining the capacity dimension.

Although Ling and Schmidt (ref 1) are particularly interested in algorithmic speed, they also
address a very important issue concerning multifractal measurement, viz., how large a data set is
needed to obtain a reasonable approximation to the fractal measures of interest in a given case?


It may well be the case that for highly complex chaotic dynamical systems (such as that
underlying the observations of pulsar 0950+08), the best one can hope to do is to determine D(0),
D(1), and D(2). However, there are numerous cases (for example, diffusion-limited aggregation)
where the determination of D(q) for general q is important. This report describes a systematic
approach, over extensive ranges of q, to the question of convergence for box-counting and
correlation integral-based multifractal analysis.

THEORY


The expression for the box-counting (Hentschel-Procaccia) generalized dimension D(q) in
a d-dimensional topological space is

$$(q - 1)\,D(q) = \lim_{E \to 0} \frac{\ln\left(\sum_{i} P_i^{q}(E)\right)}{\ln(E)}, \qquad q \neq 1,$$

where in the summation i runs over the N(E) occupied d-dimensional hypercubes (boxes) of edge length
E and P_i(E) is the probability of finding a point of the fractal set in the ith box. As implicitly stated
in Eq. (1) of Reference 1 and demonstrated for a number of cases in Reference 4, box-counting
algorithms do not converge for q<0 in many cases. In practice, one deals with finite subsets of the
fractal set and determines a numerical approximation to (q-1)D(q) by fitting

$$\ln\left(\sum_{i} P_i^{q}(E)\right) = (q - 1)\,D(q)\,\ln(E) + \mathrm{Const}$$

over an "appropriate" range of E values.
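
The fitting procedure just described can be sketched in a few lines of Python. This is a minimal illustration only, not the algorithm of Reference 1 or the ABC algorithm of Reference 4. It assumes exact coordinate data, a user-supplied list of edge lengths `edges` standing in for the "appropriate" range of E, and an ordinary least-squares line fit. The function name `box_counting_dq` and its parameters are hypothetical.

```python
import numpy as np

def box_counting_dq(points, q, edges):
    """Estimate D(q), q != 1, from the slope of ln(sum_i P_i^q(E)) versus ln(E)."""
    if q == 1:
        raise ValueError("q = 1 is excluded from this expression")
    points = np.asarray(points, dtype=float)
    log_e, log_sum = [], []
    for e in edges:
        # Integer box index of every point for boxes of edge length e.
        idx = np.floor(points / e).astype(np.int64)
        # Only occupied boxes appear; counts[i] is the population of box i.
        _, counts = np.unique(idx, axis=0, return_counts=True)
        p = counts / counts.sum()          # box probabilities P_i(E)
        log_e.append(np.log(e))
        log_sum.append(np.log(np.sum(p ** q)))
    # Least-squares line: ln(sum) = (q - 1) D(q) ln(E) + Const.
    slope, _const = np.polyfit(log_e, log_sum, 1)
    return slope / (q - 1)
```

For a uniformly filled lattice in the plane every occupied box carries the same probability, so the fitted D(q) should equal 2 for any q other than 1. For sparse or strongly nonuniform data at q<0, the sum is dominated by the least populated boxes, which is where the convergence problems noted above would be expected to appear.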

The  generalized  correlation  integral  is  defined  as 

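$$C(q,E) = \lim_{N \to \infty} \frac{1}{N} \sum_{j} \left[ \frac{1}{N} \sum_{i} H\left(E - |X_i - X_j|\right) \right]^{q-1}, \qquad q \neq 1,$$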

where at each stage in the limit process X_i and X_j (d-dimensional vectors) run over the N-element
fractal subset and H(x) is the Heaviside function. The Hentschel-Procaccia fractal dimension D(q)
is determined by


$$(q - 1)\,D(q) = \lim_{E \to 0} \frac{\ln C(q,E)}{\ln(E)}.$$

In practice, one deals with finite subsets of the fractal set and determines a numerical approximation
to (q-1)D(q) by fitting over a range of E, etc., as in box counting. (Usually a Euclidean metric is
assumed for determining |X_i - X_j|. Frequently, the outer summation in the expression for C(q,E) is
taken over a subset, a reference set, of the fractal subset.)
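
A direct evaluation of C(q,E) for a finite subset can be sketched as follows. This is an illustration under stated assumptions, not the BBCI method of Reference 5: a Euclidean metric is used, H(0) is taken as 1 so that each point counts itself (which keeps the inner sum positive for q<0), and the optional `reference` argument selects the reference set for the outer summation. The function and parameter names are hypothetical.

```python
import numpy as np

def correlation_integral(points, q, e, reference=None):
    """C(q, E) for q != 1 on a finite point set, using a Euclidean metric."""
    points = np.asarray(points, dtype=float)
    n = len(points)
    # Outer sum runs over the reference set (all points by default).
    ref = points if reference is None else points[np.asarray(reference)]
    # Distances between every reference point and every point of the subset.
    diff = ref[:, None, :] - points[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    # Inner sum: fraction of points within E of each reference point.
    # Assumption: H(0) = 1, so each reference point counts itself.
    inner = (dist <= e).sum(axis=1) / n
    return np.mean(inner ** (q - 1))

def correlation_dq(points, q, edges, reference=None):
    """Estimate D(q) from the slope of ln C(q, E) versus ln(E)."""
    log_e = np.log(np.asarray(edges, dtype=float))
    log_c = [np.log(correlation_integral(points, q, e, reference)) for e in edges]
    slope, _const = np.polyfit(log_e, log_c, 1)
    return slope / (q - 1)
```

The sketch forms the full distance matrix, so it is practical only for a few thousand points. Passing, for example, every fourth index as `reference` reduces the outer summation to a 25 percent reference set without changing the inner sums.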


RESULTS  AND  DISCUSSION 

Results  were  obtained  for  two  specific  numerical  algorithms: 

1.  The  agglomeration  box-counting  (ABC)  algorithm  (ref  4)  represents  box-counting 
algorithms.  The  results  of  standard  "sorting"  box-counting  algorithms  (for  relatively  smaller  ranges 
of  N)  at  all  q  were  consistent  with  the  results  obtained  via  ABC  for  the  models  reported  in 
Reference  4. 

2.  The  box-based  correlation  integral  (BBCI)  method  (ref  5)  provides  the  numerical 
realization of the correlation integral method. BBCI converged near analytic values in all cases.

These algorithms are well suited to the analysis of large data sets and are therefore well
suited to convergence studies. ABC and BBCI would require modifications to efficiently handle
relatively sparse data in high-dimensional spaces.

The model point sets studied include Euclidean sets, Koch asymmetric [0.4,0.2] and
symmetric triadic snowflakes (ref 6), split snowflake halls (ref 7), the 13-element generator Koch
construction (ref 8) discussed in Reference 6, and the attractor for the sixfold (D6) symmetric
chaotic mapping of Figure 3 of Reference 9.

The algorithms were applied to identical model data for each N. The multifractal data were
stored in 768x768 box arrays, which simulate image acquisition data. The initiators of the Koch
constructions (ref 6) were randomly oriented with respect to the axes of the box array. The boxes
in which the data to be analyzed were stored were designated "elementary boxes." For the point
sets studied, Reference 4 demonstrated that as the number of elementary boxes increases, converged
values tend to be closer to analytic values, but the number of points required for convergence
increases.
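
The storage scheme can be illustrated by a short sketch that accumulates two-dimensional coordinate data into a 768x768 array of elementary boxes and then merges neighboring boxes to obtain occupation counts at larger edge lengths. This is an assumed, simplified layout and not the ABC or BBCI implementation; the names `to_elementary_boxes` and `coarse_grain` are hypothetical.

```python
import numpy as np

def to_elementary_boxes(points, size=768):
    """Accumulate 2-D points into a size x size array of occupation counts."""
    points = np.asarray(points, dtype=float)
    lo = points.min(axis=0)
    span = (points.max(axis=0) - lo).max()
    if span == 0:
        span = 1.0  # all points coincide; any positive scale works
    # Scale isotropically so the larger extent spans the whole array.
    idx = np.floor((points - lo) / span * size).astype(np.int64)
    idx = np.clip(idx, 0, size - 1)  # points on the upper edge go in the last box
    counts = np.zeros((size, size), dtype=np.int64)
    np.add.at(counts, (idx[:, 0], idx[:, 1]), 1)  # handles repeated indices
    return counts

def coarse_grain(counts, k):
    """Sum k x k blocks of elementary boxes into boxes of edge length k."""
    size = counts.shape[0]
    if counts.shape != (size, size) or size % k:
        raise ValueError("k must divide the edge of a square array")
    return counts.reshape(size // k, k, size // k, k).sum(axis=(1, 3))
```

Because 768 = 3 x 256, block sizes of 2, 3, 4, 6, 8, and so on up to 768 all divide the array evenly, which gives a wide range of box edge lengths from a single stored array.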

Figures  1  through  5  present  a  selection  of  results  of  applying  the  box-counting  algorithm 
(ABC)  and  the  correlation  integral  algorithm  (BBCI)  for  specific  fractal  models  and  specific  q 
values.  The  graphs  display  measured  D(q)  versus  a  range  of  normalized  ln(N).  Each  graph  shows 
open  circles  representing  measured  values  connected  by  lines.  If  available,  the  analytic  value  of  D(q) 
is  shown  as  a  horizontal  line. 

Table 1 summarizes convergence results for the five model fractal sets. We refer to
"converged" results as those within 1 percent of the values obtained at the largest N and "good"
results as those within 5 percent. Converged values may differ from analytic values by more than
1 percent. Values of N sufficient to obtain converged results are denoted N_1, and those sufficient
for good results N_5. The true threshold for 5 percent (or 1 percent) convergence may lie below the
value of N_5 (or N_1) in the table: it falls between N_5 (or N_1) and the next lower N point.
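
The convergence criterion can be made concrete with a small helper. The sketch below assumes one reasonable reading of the definition: the sufficient N for a given tolerance is the smallest tabulated N from which all larger N give D(q) within that fraction of the value at the largest N. The function name `sufficient_n` and its arguments are hypothetical.

```python
def sufficient_n(n_values, d_values, tolerance):
    """Smallest tabulated N from which every later D(q) stays within
    `tolerance` (a fraction, e.g. 0.01 or 0.05) of the value at the largest N."""
    pairs = sorted(zip(n_values, d_values))
    d_final = pairs[-1][1]           # reference value at the largest N
    sufficient = pairs[-1][0]
    # Walk downward in N until a measured value leaves the tolerance band.
    for n, d in reversed(pairs):
        if abs(d - d_final) > tolerance * abs(d_final):
            break
        sufficient = n
    return sufficient
```

Calling the helper with tolerances 0.01 and 0.05 on the same sequence of measured values gives the 1 percent and 5 percent thresholds, respectively. Since only tabulated N are examined, the result is an upper bound on the true threshold, as noted above.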

The definition of "good" convergence is arbitrary. Multifractal measures converged within
20 percent might be important. One should not conclude that N_5 is a minimum number of points
for application of multifractal analysis. Table 1 or the figures serve as guides for the application of
box-counting and/or correlation integral algorithms.

The ABC algorithm converges to values near those of BBCI at all q for the Euclidean point
sets and the sixfold symmetric chaotic mapping; ABC diverges for q<0 for the other fractal point
sets.


BBCI  overshoots  and  converges  from  above  for  the  triadic  snowflake  (monofractal)  and  the 
sixfold  symmetric  mapping.  All  other  convergent  cases  approach  their  limits  from  below. 

With the exception of the monofractal triadic snowflake, ABC and BBCI converge at about
the same rate at q>=0. BBCI requires nearly ten times as many points at q<0 as at q>=0 to
obtain good results.

Where analytic measures are known, BBCI results are closer to them than ABC results. The model fractal
sets having D(0) near 1.2 require about 10^ points to yield good convergence at all q. Those having
D(0) near 2.0 require between 10^ and 10^ points at q>=0 and between 10^ and about 10^ points at
q<0 for good convergence.

Figure 6 shows ABC and BBCI cpu time versus normalized ln(N) for the fractal models.
BBCI cpu time exceeds ABC cpu time for large N, but it is lower at small N. The curves cross
between 10^ and 10^. Both of the present algorithms generally yield good (5 percent) convergence
for N between 10^ and 10^ at q>0.

Cpu time tends to saturate for the present box-based algorithms, which are specialized to
deal with occupation data in prescribed arrays. For exact coordinate data and the corresponding
algorithms, cpu time will go like N^2 for correlation integral methods and like N ln(N) for
sorting-based box-counting algorithms.

The size of the reference set, the number of shells in the fit, and the number of elementary
boxes influence BBCI cpu time. Preliminary investigations suggest that a reference set comprising
25 percent of the total number of points in the fractal subsets is sufficient to obtain equivalent
results. The BBCI cpu times in Figure 6 were obtained using 100 percent of the points as a
reference set and shell diameters ranging from 3 to 49. A 768x768 set of elementary boxes was used
for both ABC and BBCI.

Table 1. Sufficient Values of N to Yield 1 Percent (N_1) and 5 Percent (N_5) Convergence.
Box-Counting (ABC) Does Not Converge for q<0.

Figure 1. Hentschel and Procaccia generalized dimension D(q) versus normalized
logarithm of the number of points in the fractal subset for the
asymmetric [0.4,0.2] triadic snowflake.

(a). q = -25.

Figure 1(c). q = 0.

Figure 2. Hentschel and Procaccia generalized dimension D(q) versus normalized
logarithm of the number of points in the fractal subset for the
monofractal Koch triadic snowflake.

(a). q = -25.

Figure 2(b). q = -5.

Figure 2(c). q = 0.

Figure 2(d). q = 5.

Figure 2(e). q = 25.

Figure 3(d). q = 5.

Figure 3(e). q = 25.

Figure 4. Hentschel and Procaccia generalized dimension D(q) versus normalized
logarithm of the number of points in the fractal subset for the 13-element
generator construction (ref 8).

(a). q = -25.

Figure 4(b). q = -5.

Figure 4(c). q = 0.

Figure 4(d). q = 5.

Figure 5. Hentschel and Procaccia generalized dimension D(q) versus normalized
logarithm of the number of points in the fractal subset for the attractor
for the symmetric chaotic mapping of Figure 3(a) of Reference 9.

(a). q = -25.

Figure 5(b). q = -5.

Figure 5(c). q = 0.

Figure 5(d). q = 5.

Figure 5(e). q = 25.

Figure 6. Cpu time versus normalized logarithm of the number
of points in the fractal subsets.

(a). Asymmetric [0.4,0.2] triadic snowflake.

Figure 6(b). Koch triadic snowflake.

Figure 6(c). Split snowflake halls.

Figure 6(e). The D6 symmetric chaotic mapping of Figure 3(a) of Reference 9.